- Alert Management Working Group
- Chairperson: Louis Steinberg/IBM
-
-
-
-
-
- CURRENT MEETING REPORT
- Reported by Lee Oattes
-
-
-
- AGENDA
-
-
- o Introduction
-
- o Discussion of draft flow control document
-
- o Preliminary discussion of alert-generation document (note: this was
- shelved due to a lack of time)
-
-
- ATTENDEES
-
-
- 1. Bierbaum, Neal/vitam6!bierbaum@vitam6
-
- 2. Carter, Glen/gcarter@ddn1.dca.mil
-
- 3. Cohn, George/geo@ub.com
-
- 4. Cook, John/cook@chipcom.com
-
- 5. Denny, Barbara/denny@sri.com
-
- 6. Easterday, Tom/tom@nisca.ircc.ohio-state.edu
-
- 7. Edwards, David/dle@cisco.com
-
- 8. Fedor, Mark/fedor@nisc.nyser.net
-
- 9. Hunter, Steven/hunter@ccc.mfecc.llnl.gov
-
- 10. Kincl, Norman/kincl@iag.hp.com
-
- 11. Malkin, Gary/gmalkin@proteon.com
-
- 12. Oattes, Lee/oattes@utcs.utoronto.ca
-
- 13. Paw, Edison/esp@esd.ecom.com
-
- 14. Replogle, Joel/replogle@ncsa.uiuc.edu
-
- 15. Salo, Tim/tjs@msc.umn.edu
-
- 16. Sheridan, Jim/jsherida@ibm.com
-
- 17. Taft, Vladimir/vtaft@hpinddf.hp.com
-
- 18. Waldbusser, Steve/sw0l@andrew.cmu.edu
-
- 19. Wintringham, Dan/danw@osc.edu
-
- 20. Steinberg, Louis/louiss@ibm.com
-
-
-
- MINUTES
-
-
- 1. The meeting of the Alert Management Working Group began with an
- introduction from the Chairman (Lou Steinberg).
-
- 2. A discussion of several independent implementations of feedback/pin and
- polled, logged alerts led to an agreement to adopt these mechanisms in
- some form.
-
- 3. The following questions were answered by discussion and consensus:
-
- (a) Can we have a read-only alerts_enabled mib object, by limiting the
- transmission rate of alerts (no shutoff) and not using feedback? No.
- We need a total shutoff mechanism in case a number of alert
- generators are "screaming" all at once. The total traffic might be
- too much for the manager, and this "stable" situation cannot
- improve (while a disabling mechanism would tend to be
- self-correcting).
- Total shutoff implies the use of a resettable, read-write mib
- object.
- An automated, timer-based reset mechanism was discussed but it was
- felt that such a system might tend to sync resets of multiple
- generators and could still lead to an over-reporting condition.
-
- (b) Might an automated-reset of alerts_enabled from the manager station
- create a "blast-off-blast-off..." alert traffic pattern?
- Yes, but such a manager would still tend to only get as much
- traffic as he could handle. A re-enable would only be sent when
- the manager isn't swamped (i.e., is capable of sending one).
- A manager experiencing such a traffic pattern should readjust his
- window prior to setting alerts_enabled TRUE.
-
- (c) When pin disables alerts due to the generation of many similar
- alerts (e.g., link flapping) might we also lose an unrelated alert
- from the same system prior to resetting alerts_enabled?
- Yes, but the rate limiting (as opposed to shutoff) technique has
- the same problem; the probability of sending a single, specific
- alert is much lower than the probability of sending any one of many
- identical alerts.
- This problem is minimized by using polled, logged alerts along with
- feedback/pin (could still lose alerts if log is overwritten).
-
- (d) Should we allow the implementation to decide if alerts are totally
- disabled or limited to a max rate? No.
- Implementations should be consistent since this affects the way we
- manage our alert generators.
-
-
- (e) Can the alert log in polled, logged alerts be overfilled?
- Yes, but the standard suggests that a manager should attempt to
- keep the log empty by removing known alerts.
- If an individual implementation has no mechanism for removing old
- alerts (no set) then the log must wrap when full and the manager
- might lose alerts.
-
- (f) If using the SNMP get-next, do we want the oldest logged element
- first, or the newest first?
- Clearly the manager wants the oldest first if a full log will
- wrap...this gives him the most chance to see the oldest alert (in a
- full log) before losing it.
- No real consensus here. It seems as though this should be
- implementation specific since it only applies to SNMP, and since
- the log, actually being a table, makes this a question of "are new
- table entries added at the table top or bottom?".
-
- (g) Can we shrink the log size by stripping out only the "important"
- information from each alert?
- We can, but this is something we decided we shouldn't do. It
- requires a different parser at the manager (can't run it through
- the alert parser), and we did not know how to decide what
- information might be needed (it varied with the protocol and alert
- type).
-
- (h) How about only logging alerts, and sending an "alert logged" alert
- for each new log entry? The manager gets the asynchronous "alert
- logged" notice and reads the alert log to determine what happened.
- While this is an interesting concept, it was felt that it might
- tend to aggravate some of the other logging problems (e.g., if the
- log is filled and not overwriting, the only chance of getting the
- alert information is from the asynchronous alert, and this scheme
- replaces that information with "see the log" information).
-
- (i) A discussion of the CPU cycles and memory needed for keeping a log
- followed. Since the log size might be settable (to 0) it was felt
- that systems could allow managers to disable logging. It was also
- felt that the performance and memory hits were not large, but
- numbers to confirm this were not available.
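
The log behaviour discussed in (e), (f) and (i) can be illustrated with a minimal sketch. It is not taken from the draft document: the class name AlertLog and its methods are hypothetical, and it assumes the "oldest first" ordering from (f), on which the group reached no consensus. It shows a log that wraps when full, lets the manager remove known alerts, and is disabled by setting its size to 0.

```python
from collections import OrderedDict


class AlertLog:
    """Hypothetical polled alert log; names are illustrative, not from the draft."""

    def __init__(self, size):
        self.size = size                # a size of 0 disables logging, as in (i)
        self._entries = OrderedDict()   # index -> alert, kept oldest first
        self._next_index = 1

    def log(self, alert):
        """Store an alert and return its index, or None if logging is disabled."""
        if self.size == 0:
            return None
        if len(self._entries) >= self.size:
            # Log is full: wrap by discarding the oldest entry, as in (e).
            self._entries.popitem(last=False)
        index = self._next_index
        self._next_index += 1
        self._entries[index] = alert
        return index

    def get_next(self, after=0):
        """Return (index, alert) for the first entry past `after`, oldest first."""
        for index, alert in self._entries.items():
            if index > after:
                return index, alert
        return None

    def remove(self, index):
        """Manager removes a known alert to keep the log from filling."""
        if index in self._entries:
            del self._entries[index]
            return True
        return False
```

A manager polling such a log would walk it with get_next and call remove on each alert it has already handled, which is the "keep the log empty" behaviour suggested in (e).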
-
- 4. The following were decided by vote:
-
- (a) Feedback/Pin
- Mandatory mib objects:
-
- alerts_enabled   read/write
- window (time)    read/optional write
- max_alerts       read/optional write
-
- Do not include alert counters as mib objects for this document.
- Individual implementors will decide if they need total dropped
- and/or sent, but not everybody likes the idea of adding more
- counters as (even optional) mib objects.
- Do not optionally allow a reduced-rate mode on the over-reporting
- condition; require asynchronous alerts to be shut off totally, for
- the reasons given in the earlier discussion.
-
- (b) Polled, Logged Alerts
- Remove time field from the table, as most alerts are time stamped
- and the information in an alert should be defined by the
- protocol...not us.
-
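
As a rough illustration of how the three feedback/pin objects voted on in 4(a) might interact, here is a minimal sketch. Several things in it are assumptions rather than decisions recorded in these minutes: the class and method names are hypothetical, the window is treated as a sliding interval in seconds, and the alert that trips the limit is itself suppressed. What it does take from the minutes is that an over-reporting generator shuts off asynchronous alerts totally (no reduced-rate mode) and stays off until a manager sets alerts_enabled again.

```python
from collections import deque


class AlertGenerator:
    """Hypothetical generator using feedback/pin; not the draft's definition."""

    def __init__(self, window, max_alerts):
        self.alerts_enabled = True    # read/write object
        self.window = window          # seconds (assumed unit)
        self.max_alerts = max_alerts
        self._sent_times = deque()    # send times inside the current window

    def generate(self, now):
        """Return True if an alert raised at time `now` is sent, else False."""
        if not self.alerts_enabled:
            return False
        # Forget sends that have aged out of the sliding window.
        while self._sent_times and now - self._sent_times[0] >= self.window:
            self._sent_times.popleft()
        if len(self._sent_times) >= self.max_alerts:
            # Over-reporting: total shutoff, with no timer-based reset.
            self.alerts_enabled = False
            return False
        self._sent_times.append(now)
        return True

    def manager_enable(self):
        """Manager sets alerts_enabled TRUE once it can handle traffic again."""
        self.alerts_enabled = True
        self._sent_times.clear()
```

With window=10 and max_alerts=3, a fourth alert inside ten seconds disables the generator, and nothing further is sent until manager_enable is called, however much time passes.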